Smart Paper Notes

Adversarial Sampling for Solving Differential Equations with Neural Networks

Kshitij Parwani , Pavlos Protopapas

Categories: Machine Learning | Artificial Intelligence

2021-11-20

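Neural network-based methods for solving differential equations have been gaining traction. They work by reducing the differential equation residual of a neural network on a sample of points in each iteration. However, most of them employ standard sampling schemes, such as uniform sampling or perturbing equally spaced points. We present a novel sampling scheme that samples points adversarially to maximize the loss of the current solution estimate. A sampler architecture is described along with the loss terms used for training. Finally, we demonstrate that this scheme outperforms pre-existing schemes by comparing both on a number of problems.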
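
The abstract does not describe the sampler architecture, so the sketch below does not reproduce it. As a simplified, non-learned stand-in, it draws training points with probability proportional to the squared residual of the current solution estimate, which illustrates the idea of steering samples toward where the estimate is worst. The test equation u' + u = 0, the crude estimate u(x) = 1 - x, and every function name are assumptions made for illustration.

```python
import numpy as np

def residual(u, du, x):
    # Residual of the assumed test ODE u'(x) + u(x) = 0.
    return du(x) + u(x)

def sample_high_residual(u, du, lo, hi, n_points, n_candidates=1000, seed=0):
    """Draw points from [lo, hi] with probability proportional to the
    squared residual of the current estimate (a stand-in for a learned
    adversarial sampler, not the paper's method)."""
    rng = np.random.default_rng(seed)
    candidates = rng.uniform(lo, hi, size=n_candidates)
    loss = residual(u, du, candidates) ** 2
    probs = loss / loss.sum()
    return rng.choice(candidates, size=n_points, p=probs)

# Crude estimate of exp(-x): u(x) = 1 - x, so the residual is -x
# and the squared residual grows toward the right end of the interval.
u = lambda x: 1.0 - x
du = lambda x: -np.ones_like(x)

points = sample_high_residual(u, du, lo=0.0, hi=2.0, n_points=200)
uniform_mean = 1.0  # mean of uniform samples on [0, 2]
```

Here the sampled points concentrate near x = 2, where this estimate violates the equation most, whereas uniform sampling would spread them evenly.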

Translated by Google Translate

A Novel Statistical Independence Test for Dynamic Causal Discovery with Rare Events

Chih-Yuan Chiu , Kshitij Kulkarni , Shankar Sastry

Categories: Machine Learning (Statistics) | Machine Learning

2022-11-29

Causal phenomena associated with rare events frequently occur across a wide range of engineering and mathematical problems, such as risk-sensitive safety analysis, accident analysis and prevention, and extreme value theory. However, current methods for causal discovery are often unable to uncover causal links between random variables that manifest only when the variables first experience low-probability realizations. To address this issue, we introduce a novel algorithm that performs statistical independence tests on data collected from time-invariant dynamical systems in which rare but consequential events occur. We seek to understand if the state of the dynamical system causally affects the likelihood of the rare event. In particular, we exploit the time-invariance of the underlying data to superimpose the occurrences of rare events, thus creating a new dataset, in which rare events are better represented, on which conditional independence tests can be more efficiently performed. We provide non-asymptotic bounds for the consistency of our algorithm, and validate the performance of our algorithm across various simulated scenarios, with applications to traffic accidents.

Broken Neural Scaling Laws

Ethan Caballero , Kshitij Gupta , Irina Rish , David Krueger

Categories: Machine Learning | Artificial Intelligence

2022-10-26

We present a smoothly broken power law functional form that accurately models and extrapolates the scaling behaviors of deep neural networks (i.e. how the evaluation metric of interest varies as the amount of compute used for training, number of model parameters, training dataset size, or upstream performance varies) for each task within a large and diverse set of upstream and downstream tasks, in zero-shot, prompted, and fine-tuned settings. This set includes large-scale vision and unsupervised language tasks, diffusion generative modeling of images, arithmetic, and reinforcement learning. When compared to other functional forms for neural scaling behavior, this functional form yields extrapolations of scaling behavior that are considerably more accurate on this set. Moreover, this functional form accurately models and extrapolates scaling behavior that other functional forms are incapable of expressing such as the non-monotonic transitions present in the scaling behavior of phenomena such as double descent and the delayed, sharp inflection points present in the scaling behavior of tasks such as arithmetic. Lastly, we use this functional form to glean insights about the limit of the predictability of scaling behavior. Code is available at https://github.com/ethancaballero/broken_neural_scaling_laws
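
The abstract does not spell out the functional form. The sketch below assumes the commonly cited form y = a + b x^(-c0) * prod_i (1 + (x / d_i)^(1 / f_i))^(-c_i * f_i), where each break i has a slope change c_i, a location d_i, and a sharpness f_i. The function name and argument layout are hypothetical and are not taken from the linked repository.

```python
def bnsl(x, a, b, c0, breaks):
    """Evaluate a smoothly broken power law at x > 0.

    a      -- limiting offset
    b, c0  -- scale and exponent of the first power-law segment
    breaks -- list of (c_i, d_i, f_i): slope change, break location,
              and break sharpness (assumed parameterization)
    """
    y = b * x ** (-c0)
    for c_i, d_i, f_i in breaks:
        y *= (1.0 + (x / d_i) ** (1.0 / f_i)) ** (-c_i * f_i)
    return a + y

# With no breaks the form reduces to an ordinary power law.
plain = bnsl(10.0, a=0.0, b=1.0, c0=1.0, breaks=[])

# One break located at d = 1, evaluated exactly at the break.
at_break = bnsl(1.0, a=0.0, b=2.0, c0=0.5, breaks=[(1.0, 1.0, 1.0)])
```

Well before a break (x much smaller than d_i) the corresponding factor is close to 1, and well after it the log-log slope changes by -c_i, which is how a single expression can represent several power-law segments joined smoothly.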

Polysemanticity and Capacity in Neural Networks

Adam Scherlis , Kshitij Sachan , Adam S. Jermyn , Joe Benton , Buck Shlegeris

Categories: Neural and Evolutionary Computing | Artificial Intelligence | Machine Learning

2022-10-04

Individual neurons in neural networks often represent a mixture of unrelated features. This phenomenon, called polysemanticity, can make interpreting neural networks more difficult and so we aim to understand its causes. We propose doing so through the lens of feature capacity, which is the fractional dimension each feature consumes in the embedding space. We show that in a toy model the optimal capacity allocation tends to monosemantically represent the most important features, polysemantically represent less important features (in proportion to their impact on the loss), and entirely ignore the least important features. Polysemanticity is more prevalent when the inputs have higher kurtosis or sparsity and more prevalent in some architectures than others. Given an optimal allocation of capacity, we go on to study the geometry of the embedding space. We find a block-semi-orthogonal structure, with differing block sizes in different models, highlighting the impact of model architecture on the interpretability of its neurons.

Collaborative Human-Robot Exploration via Implicit Coordination

Yves Georgy Daoud , Kshitij Goel , Nathan Michael , Wennie Tabib

Categories: Robotics

2022-09-19

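This paper develops a methodology for collaborative human-robot exploration that leverages implicit coordination. Most autonomous single- and multi-robot exploration systems require a remote operator to provide explicit guidance to the robot team. Few works consider how to embed the human partner alongside the robots to provide guidance in the field. A remaining challenge for collaborative human-robot exploration is the efficient communication of goals from the human to the robot. In this paper, we develop a methodology that implicitly communicates a region of interest from a helmet-mounted depth camera on the human's head to the robot, together with an information gain-based exploration objective that biases motion planning toward the viewpoint provided by the human. The result is an aerial system that safely accesses regions of interest that may not be immediately viewable or reachable by the human. The approach is evaluated in simulation and in hardware experiments in a motion capture arena. Videos of the simulation and hardware experiments are available at: https://youtu.be/7jgkbpvfioe.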

Hierarchical Collision Avoidance for Adaptive-Speed Multirotor Teleoperation

Kshitij Goel , Yves Georgy Daoud , Nathan Michael , Wennie Tabib

Categories: Robotics

2022-09-17

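This paper improves safe teleoperation of a multirotor by developing a hierarchical collision avoidance method that modulates the maximum speed based on environment complexity and perceptual constraints. Safe speed modulation is challenging in environments that exhibit varying clutter. Existing methods fix the maximum speed and map resolution, which prevents the vehicle from entering tight spaces and places the cognitive load of changing speed on the operator. We address these gaps by proposing a high-rate (10 Hz) teleoperation approach that modulates the maximum vehicle speed through hierarchical collision checking. The hierarchical collision checker simultaneously adapts the voxel size of the local map and the maximum vehicle speed to ensure that motion planning is safe. The proposed methodology is evaluated in simulation and real-world experiments and compared with a non-adaptive, motion primitives-based teleoperation approach. The results demonstrate the advantages of the proposed teleoperation approach and its ability to complete the task without requiring the user to specify a maximum vehicle speed.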

Net2Brain: A Toolbox to compare artificial vision models with human brain responses

Domenic Bersch , Kshitij Dwivedi , Martina Vilas , Radoslaw M. Cichy , Gemma Roig

Categories: Computer Vision | Artificial Intelligence

2022-08-20

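We introduce Net2Brain, a graphical and command-line user interface toolbox for comparing the representational spaces of artificial deep neural networks (DNNs) and human brain recordings. While other toolboxes facilitate only a single functionality or focus on a small subset of supervised image classification models, Net2Brain allows the extraction of activations from more than 600 DNNs trained to perform a diverse range of vision-related tasks (e.g., semantic segmentation, depth estimation, action recognition), over both image and video datasets. The toolbox computes representational dissimilarity matrices (RDMs) over those activations and compares them with brain recordings using representational similarity analysis (RSA) and weighted RSA, both in specific ROIs and with searchlight search. In addition, a new dataset of stimuli and brain recordings can be added to the toolbox for evaluation. We demonstrate the functionality and advantages of Net2Brain with an example showing how it can be used to test hypotheses of cognitive computational neuroscience.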
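
The RDM and RSA steps named in the abstract can be sketched generically as follows. This does not use Net2Brain's API, which the abstract does not show. All function names are hypothetical, and the choices of 1 minus Pearson correlation as the dissimilarity and a Spearman-style rank correlation as the comparison are common RSA conventions assumed here.

```python
import numpy as np

def rdm(activations):
    """Representational dissimilarity matrix from an
    (n_stimuli, n_features) array, using 1 - Pearson correlation
    between the response patterns of each pair of stimuli."""
    return 1.0 - np.corrcoef(activations)

def upper_triangle(matrix):
    # Off-diagonal upper triangle: each stimulus pair exactly once.
    rows, cols = np.triu_indices(matrix.shape[0], k=1)
    return matrix[rows, cols]

def ordinal_ranks(values):
    # Ranks without tie averaging (a simplification that is adequate
    # for continuous-valued dissimilarities).
    return np.argsort(np.argsort(values)).astype(float)

def rsa_score(rdm_a, rdm_b):
    """Rank correlation between the pairwise entries of two RDMs."""
    a = ordinal_ranks(upper_triangle(rdm_a))
    b = ordinal_ranks(upper_triangle(rdm_b))
    return float(np.corrcoef(a, b)[0, 1])

rng = np.random.default_rng(0)
model_acts = rng.normal(size=(6, 50))   # 6 stimuli, 50 model units
brain_resp = rng.normal(size=(6, 20))   # 6 stimuli, 20 recording channels

model_rdm = rdm(model_acts)
self_score = rsa_score(model_rdm, model_rdm)
cross_score = rsa_score(model_rdm, rdm(brain_resp))
```

Because the comparison happens between RDMs rather than raw responses, the model layer and the brain recording may have different numbers of features, as long as both were measured on the same stimuli.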

Modelling Social Context for Fake News Detection: A Graph Neural Network Based Approach

Pallabi Saikia , Kshitij Gundale , Ankit Jain , Dev Jadeja , Harvi Patel , Mohendra Roy

Categories: Natural Language Processing

2022-07-27

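Detecting fake news is crucial for ensuring the authenticity of information and maintaining the reliability of the news ecosystem. Recently, fake news content has increased because of the proliferation of social media and of fake content generation techniques such as Deep Fake. Most existing approaches to fake news detection focus on content-based methods. However, most of these techniques fail to handle ultra-realistic synthetic media produced by generative models. Our recent studies find that the propagation characteristics of authentic and fake news are distinguishable, irrespective of their modality. In this regard, we have investigated auxiliary information based on social context to detect fake news. This paper analyzes the social context of fake news detection with a hybrid graph neural network-based approach. The hybrid model integrates a graph neural network over the propagation of news with bidirectional encoder representations from Transformers (BERT) over the news content to learn text features. The proposed approach therefore learns content as well as context features, and outperforms the baseline models with F1 scores of 0.91 on PolitiFact and 0.93 on the GossipCop dataset, respectively.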

Scalable training of graph convolutional neural networks for fast and accurate predictions of HOMO-LUMO gap in molecules

Jong Youl Choi , Pei Zhang , Kshitij Mehta , Andrew Blanchard , Massimiliano Lupo Pasini

Categories: Machine Learning | Artificial Intelligence

2022-07-22

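Graph convolutional neural networks (GCNNs) are a popular class of deep learning (DL) models in materials science for predicting material properties from the graph representation of molecular structures. Training an accurate and comprehensive GCNN surrogate for molecular design requires large-scale graph datasets and is usually a time-consuming process. Recent advances in GPUs and distributed computing open a path to effectively reducing the computational cost of GCNN training. However, efficient use of high-performance computing (HPC) resources for training requires simultaneously optimizing large-scale data management and scalable stochastic batched optimization techniques. In this work, we focus on building GCNN models on HPC systems to predict material properties of millions of molecules. We use HydraGNN, our in-house library for large-scale GCNN training, which leverages distributed data parallelism in PyTorch. We use ADIOS, a high-performance data management framework, for efficient storage and reading of large molecular graph data. We perform parallel training on two open-source large-scale graph datasets to build a GCNN predictor for an important quantum property known as the HOMO-LUMO gap. We measure the scalability, accuracy, and convergence of our approach on two DOE supercomputers: the Summit supercomputer at the Oak Ridge Leadership Computing Facility (OLCF) and the Perlmutter system at the National Energy Research Scientific Computing Center (NERSC). We present experimental results with HydraGNN showing (i) a reduction in data loading time of up to 4.2 times compared with a conventional method and (ii) linear scaling performance for training on up to 1,024 GPUs on both Summit and Perlmutter.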

Reward-Sharing Relational Networks in Multi-Agent Reinforcement Learning as a Framework for Emergent Behavior

Hossein Haeri , Reza Ahmadzadeh , Kshitij Jerath

Categories: Artificial Intelligence

2022-07-12

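In this work, we integrate "social" interactions into the MARL setup through a user-defined relational network and examine the effects of agent-agent relations on the rise of emergent behaviors. Leveraging insights from sociology and neuroscience, our proposed framework models agent relationships using the notion of Reward-Sharing Relational Networks (RSRN), where network edge weights measure how much one agent is invested in the success of (or "cares about") another. We construct relational rewards as a function of the RSRN interaction weights to collectively train the multi-agent system via a multi-agent reinforcement learning algorithm. The performance of the system is tested in a 3-agent scenario with different relational network structures (e.g., self-interested, communitarian, and authoritarian networks). Our results indicate that reward-sharing relational networks can significantly influence learned behaviors. We posit that RSRN can act as a framework in which different relational networks produce distinct emergent behaviors, often analogous to the intuitive sociological understanding of such networks.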
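
The abstract says only that relational rewards are "a function of the RSRN interaction weights". The sketch below assumes the simplest such function, a weighted sum of the agents' individual rewards, and uses illustrative weight matrices for the three named network types. Both the aggregation rule and the matrices are assumptions, not the paper's definitions.

```python
def relational_rewards(weights, rewards):
    """Relational reward of agent i as sum_j weights[i][j] * rewards[j],
    where weights[i][j] says how much agent i cares about agent j
    (weighted-sum aggregation is an assumption)."""
    n = len(rewards)
    return [sum(weights[i][j] * rewards[j] for j in range(n))
            for i in range(n)]

# Illustrative 3-agent networks (hypothetical weights).
self_interested = [[1, 0, 0],
                   [0, 1, 0],
                   [0, 0, 1]]   # each agent cares only about itself
communitarian = [[1, 1, 1],
                 [1, 1, 1],
                 [1, 1, 1]]     # every agent cares about all agents
authoritarian = [[1, 0, 0],
                 [1, 0, 0],
                 [1, 0, 0]]     # every agent cares only about agent 0

individual = [1.0, 2.0, 3.0]
selfish = relational_rewards(self_interested, individual)
shared = relational_rewards(communitarian, individual)
leader = relational_rewards(authoritarian, individual)
```

Under these assumptions the self-interested network leaves rewards unchanged, the communitarian network gives every agent the team total, and the authoritarian network makes every agent optimize for agent 0's reward. Each agent would then be trained on its relational reward in place of its individual one.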
